Let’s start by uploading all the neccessary packages.
suppressPackageStartupMessages(library(tidyverse))
## Warning: package 'tidyverse' was built under R version 3.5.1
## Warning: package 'ggplot2' was built under R version 3.5.1
## Warning: package 'tibble' was built under R version 3.5.1
## Warning: package 'tidyr' was built under R version 3.5.1
## Warning: package 'readr' was built under R version 3.5.1
## Warning: package 'purrr' was built under R version 3.5.1
## Warning: package 'dplyr' was built under R version 3.5.1
## Warning: package 'stringr' was built under R version 3.5.1
## Warning: package 'forcats' was built under R version 3.5.1
suppressPackageStartupMessages(library(gapminder))
## Warning: package 'gapminder' was built under R version 3.5.1
suppressPackageStartupMessages(library(RColorBrewer))
suppressPackageStartupMessages(library(scales))
## Warning: package 'scales' was built under R version 3.5.1
suppressPackageStartupMessages(library(plotly))
## Warning: package 'plotly' was built under R version 3.5.1
The goal of this component is to remake at least one figure.Then, make a new graph by converting this visual to a plotly graph.
Here is a scatterplot of gdpPercap and lifeExp for European countries between 1970 and 1994 I completed in homework 2:
gapminder %>%
filter(continent == "Europe") %>%
filter(year>1970 & year<1994) %>%
ggplot(aes(gdpPercap, lifeExp, color=year))+
geom_point()+
geom_smooth()
## Warning: package 'bindrcpp' was built under R version 3.5.1
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
I did the following to improve the scatterplot: 1. Created a new data file for the final plot.
2. Changed the selection parameters by removing the restrictions on years. 3. The size of points is now connected to the population size. 4. Made points somewhat transparent. 5. Added titles to the graph, renamed x-axis and y-axis. 6. Changed the colour of the smoothing funciton, made it transparent, changed the size of the line. 7. Added dollar sign format to x-axis values. 8. Used a new colour scheme for years: scale_colour_viridis_c. 9. Made sure the population size was reflected in a correct format on a side. 10. Added a new theme: theme_bw()
gapminder_Europe <- gapminder %>%
filter(continent == "Europe")
gapminder_Europe_gdpPercap <- gapminder_Europe %>%
ggplot(aes(gdpPercap, lifeExp, color=year, size=pop, alpha=0.2)) +
geom_point() +
labs(title = "Scatterplot of GDP per Capita and Life Expectancy (Europe)", x = "GDP per Capita",
y = "Average Life Expectancy") +
geom_smooth(fill="purple", alpha=0.3, size=0.7) +
scale_x_log10(labels=dollar_format()) +
scale_colour_viridis_c() +
scale_size_continuous(
labels = comma_format())+
theme_bw()
gapminder_Europe_gdpPercap
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
We can observe a strong linear relationship between gdpPercap and lifeExp.
Next, I wanted redo this plot for all other continents with countries whose gdpPercap is between 5,000 and 10,000.
gapminder_gdpPercap <- gapminder %>%
filter(gdpPercap>5000 & gdpPercap <10000) %>%
ggplot(aes(gdpPercap, lifeExp, color=year, size=pop, alpha=0.2)) +
geom_point() +
facet_wrap(~continent) +
geom_smooth(method='lm', size=0.5, alpha=0.2)+
labs(title = "Scatterplot of GDP per Capita and Life Expectancy",
x = "GDP per Capita",
y = "Average Life Expectancy") +
scale_x_log10(labels=dollar_format()) +
scale_colour_viridis_c() +
scale_size_continuous(
labels = comma_format())+
theme_bw()+
theme(strip.background = element_rect(fill = "yellow"),
strip.text = element_text(color = "black"),
axis.text = element_text(size=9))
gapminder_gdpPercap
We can notice here tat one of the continents (Oceania) doesn’t have data that satisfies this requirement so the whole continent is missing. There is also no longer a linear relationship between gdpPercap and lifeExp - this pattern holds true for all four continents.
Finally, let’s convert our two generated plots into the plotly format so we can interact with it.
Start with transforming gapminder_Europe_gdpPercap.
gapminder_Euro_plotly <- ggplotly(gapminder_Europe_gdpPercap)
## `geom_smooth()` using method = 'loess' and formula 'y ~ x'
gapminder_Euro_plotly
We can impove on this plot by adding a few changes to our code: 1. Year will appear as a moving format so color will be connected to the population size. 2. Country names will appear on each point.
gapminder_Europe_plotly <- gapminder_Europe %>%
ggplot(aes(gdpPercap, lifeExp, color=pop, alpha=0.2, frame = year)) +
geom_point(aes(size=pop, ids = country)) +
labs(title = "Scatterplot of GDP per Capita and Life Expectancy (Europe)", x = "GDP per Capita",
y = "Average Life Expectancy") +
geom_smooth(method="lm", fill="purple", alpha=0.15, size=0.7) +
scale_x_log10(labels=dollar_format()) +
scale_colour_viridis_c(trans = "log10",
breaks = 10^(1:10),
labels = comma_format()
) +
scale_size_continuous(
labels = comma_format())+
theme_bw()
## Warning: Ignoring unknown aesthetics: ids
gapminder_Europe_plotly <- ggplotly(gapminder_Europe_plotly )
gapminder_Europe_plotly
Similarly, let’s transform gapminder_gdpPercap into the plotly format.
gapminder_gdp_plotly <- ggplotly(gapminder_gdpPercap)
gapminder_gdp_plotly
If we can keep the year variable as a frame, we no longer have to restrict our observations to a small number of countries (yet We will exclude Oceania as it only has 2 countries).
gapminder_gdpPercap_plotly <- gapminder %>%
filter(continent!="Oceania") %>%
ggplot(aes(gdpPercap, lifeExp, color=pop, frame=year, alpha=0.2)) +
geom_point(aes(size=pop,ids=country)) +
facet_wrap(~continent) +
geom_smooth(method='lm', size=0.5, alpha=0.2)+
labs(title = "Scatterplot of GDP per Capita and Life Expectancy",
x = "GDP per Capita",
y = "Average Life Expectancy") +
scale_x_log10(labels=dollar_format()) +
scale_colour_viridis_c(trans = "log10",
breaks = 10^(1:10),
labels = comma_format()
) +
scale_size_continuous(
labels = comma_format())+
theme_bw()+
theme(strip.background = element_rect(fill = "yellow"),
strip.text = element_text(color = "black"),
axis.text = element_text(size=9))
## Warning: Ignoring unknown aesthetics: ids
gapminder_gdpPercap_plotly <- ggplotly(gapminder_gdpPercap_plotly)
gapminder_gdpPercap_plotly